6 research outputs found

    The use of lexical tone in the segmentation of speech

    No full text
    Introduction. In speech, there are no blank spaces to signal boundaries between words as there are in written language, but listeners can nevertheless recognise individual words rapidly. Without these blank spaces or commas, listeners have to divide up – segment – the continuous speech stream into discrete words using other means. This study aimed to investigate the tonal cues important for speech segmentation in Swedish. We know that different languages use different cues in speech segmentation, such as stress (Norris, McQueen, & Cutler, 1995), syllable weight (Cutler & Norris, 1988) and vowel harmony (Suomi, McQueen, & Cutler, 1997), but we do not yet know the extent to which phonological cues are used in speech segmentation. In English, stressed and metrically strong syllables are heard as more reliable word onsets, leading the parser to initiate a lexical access attempt at these points. Accurate segmentation is crucial since words can always be embedded in larger words, and these spurious embedded words are activated in memory (Luce & Cluff, 1998): the phrase start writing potentially includes star, trite, try, rye and so on (Cutler, 2012). However, no study has yet investigated speech segmentation in languages like Swedish, where prosody systematically combines with morphology. Doing so will allow us to understand more fully the universal drivers behind speech segmentation.
    In Swedish, every word or word stem has a lexical tone known as a word accent, in addition to stress. In Central Swedish, this tone is either low (accent 1) or high (accent 2). All monosyllabic words have accent 1, and the majority of polysyllabic words – such as compounds – have accent 2 on the word stem, especially trochees. There is also an interaction between prosody and morphology, so that stem word accent is also determined by suffixation: the word stem båt (‘boat’) has accent 1 preceding the singular suffix -en (båt1-en) but accent 2 preceding the plural suffix -ar (båt2-ar).
With regard to word embeddings, a frequent accent 2 word with a plural suffix like möten2 (‘meetings’) potentially contains mö (‘maiden’) and tenn (‘tin’), and the accent 2 on the word stem ensures it can also be heard as the compound mö-tenn (‘maiden tin’). However, the string möten1 with accent 1 can only be heard as two words, as in the phrase möt en ko (‘meet a cow’). Accent 2 has thus been proposed to be ‘connective’ (Elert, 1970; Malmberg, 1959): it signals that more syllables will follow, belonging to the same lexical item. A string with accent 2 can thus always contain other words, perhaps more so than a string with accent 1, which might make accent 2 strings more difficult to segment than accent 1 strings, especially in the case of monosyllabic targets.
    This study used a word spotting paradigm to investigate the segmentation of Swedish words embedded in non-word frames to determine how prosody and syllable structure interact to affect word spotting performance.
    Methods. Native speakers of Swedish listened to auditory stimuli – trisyllabic non-word frames – recorded by a native speaker of Central Swedish. They were asked to press a button when they heard a Swedish word at the beginning of a string and then to enter the word using the computer keyboard. Each participant heard 15 monosyllabic target words embedded in accent 1 frames (bal-ädi1 ‘ball’), 15 monosyllabic words in accent 2 frames (bal-ädi2), 15 disyllabic words in accent 2 frames (bagge-pi2 ‘ram’) and 15 disyllabic words in accent 1 frames (bagge-pi1). All target items were matched for word frequency. Word accent pairs were counterbalanced across subjects. There were 60 fillers, containing no possible Swedish words.
For response times, only trials where participants spotted and typed in the correct word were included, whereas all trials were included in the accuracy analysis.
    Data analysis and results. Response times were analysed using a generalised linear mixed-effects model with an inverse Gaussian distribution and identity link using the lme4 package in R (Bates, Mächler, Bolker, & Walker, 2015). Word accent and number of target syllables were included as deviation-coded fixed effects with participant and item as random effects. The fastest response times were found for disyllabic words (e.g. bagge) in accent 2 frames, significantly faster than for monosyllabic words (e.g. bal) in accent 2 frames. Response accuracy was analysed using the same model structure as for response times but with a binomial distribution and logit link. An interaction between accent and number of target syllables showed that disyllabic words were spotted more successfully than monosyllabic words in accent 2 frames.
    Discussion. Monosyllabic targets were more difficult to spot in accent 2 strings, as shown by both response time and accuracy. This can possibly be explained by the fact that accent 2 strings can always contain other words, slowing down speech segmentation and recognition. It is also possible that the word accent triggers inappropriate syllabification, so that bal in bal-ädi2 is heard as the non-word ba (*ba-lädi), much as strong syllables signal a segmentation point and prompt syllabification in English (Cutler & Norris, 1988).
    References.
    Bates, D., Mächler, M., Bolker, B., & Walker, S. (2015). Fitting Linear Mixed-Effects Models Using lme4. Journal of Statistical Software, 67(1). doi:10.18637/jss.v067.i01
    Cutler, A. (2012). Native Listening: Language Experience and the Recognition of Spoken Words. The MIT Press.
    Cutler, A., & Norris, D. (1988). The Role of Strong Syllables in Segmentation for Lexical Access. Journal of Experimental Psychology: Human Perception and Performance, 14(1), 113–121. doi:10.1037/0096-1523.14.1.113
    Elert, C.-C. (1970). Ljud och ord i svenskan. Stockholm: Almqvist & Wiksell.
    Luce, P. A., & Cluff, M. S. (1998). Delayed commitment in spoken word recognition: Evidence from cross-modal priming. Perception & Psychophysics, 60(3), 484–490. doi:10.3758/Bf03206868
    Malmberg, B. (1959). Bemerkungen zum schwedischen Wortakzent. Zeitschrift für Phonetik, 12, 193–207.
    Norris, D., McQueen, J. M., & Cutler, A. (1995). Competition and Segmentation in Spoken-Word Recognition. Journal of Experimental Psychology: Learning, Memory, and Cognition, 21(5), 1209–1228. doi:10.1037/0278-7393.21.5.1209
    Suomi, K., McQueen, J. M., & Cutler, A. (1997). Vowel Harmony and Speech Segmentation in Finnish. Journal of Memory and Language, 36(3), 422–444. doi:10.1006/jmla.1996.249
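The deviation coding mentioned in the data analysis above can be illustrated with a minimal sketch. The study itself fitted its models with lme4 in R; this Python fragment only shows how the two fixed effects and their interaction might be coded. The ±0.5 values, the level names and the function names are assumptions made for illustration and are not taken from the study.

```python
# Deviation (sum-to-zero) coding for a 2 x 2 design:
# word accent (accent 1 vs accent 2) x target length (mono- vs disyllabic).
# ASSUMPTION: the +/-0.5 values and the sign given to each level are illustrative.
ACCENT_CODE = {"accent1": -0.5, "accent2": 0.5}
SYLLABLE_CODE = {"monosyllabic": -0.5, "disyllabic": 0.5}


def design_row(accent, syllables):
    """Return (intercept, accent, syllables, interaction) predictors for one trial."""
    a = ACCENT_CODE[accent]
    s = SYLLABLE_CODE[syllables]
    return (1.0, a, s, a * s)


def design_matrix():
    """Build one predictor row for each of the four cells of the design."""
    cells = [(a, s) for a in ACCENT_CODE for s in SYLLABLE_CODE]
    return {cell: design_row(*cell) for cell in cells}


if __name__ == "__main__":
    for cell, row in design_matrix().items():
        print(cell, row)
```

Because each coded column sums to zero across the four cells, each main effect in a balanced design is estimated at the average of the other factor rather than at a reference level.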

    The role of tone in Swedish speech segmentation

    No full text
    Introduction. In speech, there are no blank spaces to signal boundaries between words as there are in written language, but listeners can nevertheless recognise individual words rapidly. Without these blank spaces or commas, listeners have to divide up – segment – the continuous speech stream into discrete words using other means. This study aimed to investigate the tonal cues important for speech segmentation in Swedish. We know that different languages use different cues in speech segmentation, such as stress (Norris, McQueen, & Cutler, 1995), syllable weight (Cutler & Norris, 1988) and vowel harmony (Suomi, McQueen, & Cutler, 1997), but we do not yet know the extent to which phonological cues are used in speech segmentation. In English, stressed and metrically strong syllables are heard as more reliable word onsets, leading the parser to initiate a lexical access attempt at these points. Accurate segmentation is crucial since words can always be embedded in larger words, and these spurious embedded words are activated in memory (Luce & Cluff, 1998): the phrase start writing potentially includes star, trite, try, rye and so on (Cutler, 2012). However, no study has yet investigated speech segmentation in languages like Swedish, where prosody systematically combines with morphology. Doing so will allow us to understand more fully the universal drivers behind speech segmentation.

    The time course of onset CV coarticulation

    No full text
    The study investigates the center of gravity of onset fricatives as the main acoustic feature for assessing the relation between vowel pronunciation and the coarticulatory spectral characteristics of the onset consonant. /s/- and /f/-initial CV sequences were analyzed with backness, roundedness and height of the vowel as predictors of fricative center of gravity. Results showed that the first 15 ms of an onset fricative could carry predictive cues to the upcoming vowel.
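The abstract above does not state how the center of gravity was computed. A common definition is the power-weighted mean frequency of the spectrum, and the sketch below assumes that definition, applied to the first 15 ms of a signal with a naive DFT. All function names, the 8 kHz sample rate and the synthetic test tone are hypothetical and only serve the illustration.

```python
import cmath
import math


def sine_wave(freq_hz, sample_rate, n_samples):
    """Synthetic test tone (hypothetical stand-in for a recorded fricative)."""
    return [math.sin(2 * math.pi * freq_hz * t / sample_rate) for t in range(n_samples)]


def initial_window(samples, sample_rate, duration_ms=15.0):
    """Keep only the first duration_ms milliseconds of the signal."""
    n = int(round(sample_rate * duration_ms / 1000.0))
    return samples[:n]


def power_spectrum(samples, sample_rate):
    """Naive DFT: return (frequency_hz, power) pairs for bins 0..N/2."""
    n = len(samples)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(samples))
        spectrum.append((k * sample_rate / n, abs(coeff) ** 2))
    return spectrum


def center_of_gravity(samples, sample_rate):
    """Power-weighted mean frequency in Hz (ASSUMED definition, see lead-in)."""
    spectrum = power_spectrum(samples, sample_rate)
    total = sum(p for _, p in spectrum)
    if total == 0:
        raise ValueError("signal has no energy")
    return sum(f * p for f, p in spectrum) / total


if __name__ == "__main__":
    sr = 8000
    tone = sine_wave(2000.0, sr, 400)
    print(center_of_gravity(initial_window(tone, sr), sr))
```

For a pure tone the center of gravity coincides with the tone's frequency; for a fricative it summarises where the noise energy is concentrated, which is the quantity the vowel features would then predict.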

    Pre-activation negativity in language brain potentials

    No full text
    The pre-activation negativity (PrAN) is an event-related potential (ERP) component indexing how constraining phonological cues are. It has an early phase (136–200 ms), with sources in the left auditory cortices, and a late phase (200 ms onwards), with sources in Broca’s area. The PrAN has been found for segmental and prosodic cues increasing the certainty about upcoming words, morphemes, grammatical structures, or lexicality. The phonological cues investigated have been Central Swedish, South Swedish, Danish, and English segmental phonemes, Central and South Swedish lexical tone accents, Danish stød, and Central Swedish boundary tones and left-edge boundary tones/initiality accents.